Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model
نویسندگان
چکیده
A maximum likelihood parameter trajectory estimation based on a Gaussian mixture model (GMM) has been successfully implemented for acoustic-to-articulatory inversion mapping. In the conventional method, GMM parameters are optimized by maximizing a likelihood function for joint static and dynamic features of acoustic-articulatory data, and then, the articulatory parameter trajectories are estimated for given the acoustic data by maximizing a likelihood function for only the static features, imposing a constraint between static and dynamic features to consider the inter-frame correlation. Due to the inconsistency of the training and mapping criterion, the trained GMM is not optimum for the mapping process. This inconsistency problem is addressed within a trajectory training framework, but it becomes more difficult to optimize some parameters, e.g., covariance matrices and mixture component sequences. In this paper, we propose an inversion mapping method based on a latent trajectory GMM (LT-GMM) as yet another way to overcome the inconsistency issue. The proposed method makes it possible to use a well-formulated algorithm, such as EM algorithm, to optimize the LT-GMM parameters, which is not feasible in the traditional trajectory training. Experimental results demonstrate that the proposed method yields higher accuracy in the inversion mapping compared to the conventional GMM-based method.
منابع مشابه
Acoustic - to - Articulatory Inversion Mapping with Variational Latent Trajectory Gaussian Mixture Model ∗
متن کامل
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
In this paper, we describe a statistical approach to both an articulatory-to-acoustic mapping and an acoustic-to-articulatory inversion mapping without using phonetic information. The joint probability density of an articulatory parameter and an acoustic parameter is modeled using a Gaussian mixture model (GMM) based on a parallel acoustic-articulatory speech database. We apply the GMM-based ma...
متن کاملSpeaker adaptation of an acoustic-to-articulatory inversion model using cascaded Gaussian mixture regressions
The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...
متن کاملSpeaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions
The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...
متن کاملOn smoothing articulatory trajectories obtained from Gaussian mixture model based acoustic-to-articulatory inversion.
It is well-known that the performance of acoustic-to-articulatory inversion improves by smoothing the articulatory trajectories estimated using Gaussian mixture model (GMM) mapping (denoted by GMM + Smoothing). GMM + Smoothing also provides similar performance with GMM mapping using dynamic features, which integrates smoothing directly in the mapping criterion. Due to the separation between smo...
متن کامل